Date

2023-02-09 02:44 | Source: compiled from the web

Date-time differences between rows in R

Try this (I am assuming that you have your data in a data.frame called mydf and that you want the difference between the first timestamp and all subsequent timestamps):

c_time <- ...
... %>% mutate(time_diff_2 = as.numeric(Cl_Date - lag(Cl_Date), units = 'mins'))

Convert the time difference to a numeric value. You can use the units argument so that the returned values are always in the same unit.
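To see why the explicit units argument matters, here is a minimal base R sketch. The timestamps and the names stamps, gaps, gap_mins and gap_secs are made up for illustration and are not from the original question.

```r
# Hypothetical timestamps, parsed as POSIXct in UTC
stamps <- as.POSIXct(c("2018-02-01 16:07", "2018-02-01 16:09", "2018-02-01 16:32"),
                     format = "%Y-%m-%d %H:%M", tz = "UTC")

# diff() on POSIXct returns a difftime; R picks the unit automatically
gaps <- diff(stamps)

# as.numeric() with an explicit unit returns plain numbers in a known unit
gap_mins <- as.numeric(gaps, units = "mins")
gap_secs <- as.numeric(gaps, units = "secs")
```

Without units, the number you get back depends on whichever unit R chose for the difftime, which can change from one data set to the next.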

How to calculate time difference in consecutive rows

When you just add default = strptime(v_time, "%d/%m/%Y %H:%M")[1] to the lag part:

df <- df %>%
  arrange(visitor, v_time) %>%
  group_by(visitor) %>%
  mutate(diff = strptime(v_time, "%d/%m/%Y %H:%M") - lag(strptime(v_time, "%d/%m/%Y %H:%M"), default = strptime(v_time, "%d/%m/%Y %H:%M")[1]),
         diff_secs = as.numeric(diff, units = 'secs'))

you get the result you expect:

> df
# A tibble: 8 x 6
# Groups:   visitor [3]
  visitor v_time         payment items  diff diff_secs
1 David   1/2/2018 16:12     25.    2.     0        0.
2 David   1/2/2018 16:21     25.    5.   540      540.
3 Jack    1/2/2018 16:07     35.    3.     0        0.
4 Jack    1/2/2018 16:09    160.    1.   120      120.
5 Jack    1/2/2018 16:32     85.    5.  1380     1380.
6 Jack    1/2/2018 16:55      6.    2.  1380     1380.
7 Kate    1/2/2018 16:16      3.    3.     0        0.
8 Kate    1/2/2018 16:33    639.    3.  1020     1020.

Another option is to use difftime:

df <- df %>%
  arrange(visitor, v_time) %>%
  group_by(visitor) %>%
  mutate(diff = difftime(strptime(v_time, "%d/%m/%Y %H:%M"), lag(strptime(v_time, "%d/%m/%Y %H:%M"), default = strptime(v_time, "%d/%m/%Y %H:%M")[1]), units = 'mins'),
         diff_secs = as.numeric(diff, units = 'secs'))

Now the diff column is in minutes and the diff_secs column is in seconds:

> df
# A tibble: 8 x 6
# Groups:   visitor [3]
  visitor v_time         payment items  diff diff_secs
1 David   1/2/2018 16:12     25.    2.     0        0.
2 David   1/2/2018 16:21     25.    5.     9      540.
3 Jack    1/2/2018 16:07     35.    3.     0        0.
4 Jack    1/2/2018 16:09    160.    1.     2      120.
5 Jack    1/2/2018 16:32     85.    5.    23     1380.
6 Jack    1/2/2018 16:55      6.    2.    23     1380.
7 Kate    1/2/2018 16:16      3.    3.     0        0.
8 Kate    1/2/2018 16:33    639.    3.    17     1020.

You can now save the result again with write.csv(df,"C:/output.csv", row.names = FALSE)

Calculate the difference in time between two dates and add them to a new column

You need to make some changes in your code.

First and foremost, don't use $ in dplyr pipes. Pipes (%>%) were created so that you don't have to write df$column_name every time you want to use a variable from the data frame. Using $ can have unintended consequences when grouping the data or using rowwise, as you can see in your case.

Secondly, difftime is vectorised, so there is no need for rowwise here.

Finally, if you want the time difference in minutes, you should convert the values to POSIXct rather than to dates. Try the following:

library(dplyr)

df <- df %>%
  mutate(trip_duration = difftime(as.POSIXct(`end time`), as.POSIXct(`start time`), units = "mins"))
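The point about POSIXct versus dates can be checked with a small base R sketch. The two time strings and the names start_time, end_time, as_dates and as_posix are hypothetical; the time zone is fixed to UTC only to keep the example reproducible.

```r
# Hypothetical start and end of a trip on the same day
start_time <- "2021-03-01 08:15:00"
end_time   <- "2021-03-01 09:00:00"

# as.Date() drops the time of day, so both values become the same date
as_dates <- difftime(as.Date(end_time), as.Date(start_time), units = "mins")

# as.POSIXct() keeps the time of day, so the real duration survives
as_posix <- difftime(as.POSIXct(end_time, tz = "UTC"),
                     as.POSIXct(start_time, tz = "UTC"),
                     units = "mins")
```

Here as_dates is 0 minutes while as_posix is 45 minutes, which is why Date values are not enough for sub-day durations.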

R Difference in time between rows

You can use lag and difftime (per Hadley):

df %>%
  mutate(time = as.POSIXct(start, format = "%m/%d/%y %H:%M")) %>%
  group_by(id) %>%
  mutate(diff = difftime(time, lag(time)))

# A tibble: 6 x 4
# Groups:   id [2]
     id start         time                 diff
1    1. 1/31/17 10:00 2017-01-31 10:00:00
2    1. 1/31/17 10:02 2017-01-31 10:02:00     2
3    1. 1/31/17 10:45 2017-01-31 10:45:00    43
4    2. 2/10/17 12:00 2017-02-10 12:00:00
5    2. 2/10/17 12:20 2017-02-10 12:20:00    20
6    2. 2/11/17 09:40 2017-02-11 09:40:00  1280

How to find time difference between previous and following rows from specific rows

Using fuzzyjoin might be useful here:

library(dplyr)
library(fuzzyjoin)

df_grp <- df %>%
  filter(start == "yes") %>%
  select(time) %>%
  group_by(grp = row_number()) %>%
  mutate(begin = time - 5, end = time + 5)

First we create a data.frame of the rows where start is "yes", with begin and end set to time - 5 and time + 5:

# A tibble: 2 x 4
   time   grp begin   end
1  2.82     1 -2.17  7.82
2 16.8      2 11.8  21.8

Next we use a fuzzy_join to attach it to the original data.frame and calculate the differences:

df %>%
  fuzzy_left_join(df_grp, by = c("time" = "begin", "time" = "end"), match_fun = list(`>`, `<`)) %>%
  group_by(grp) %>%
  mutate(diff = time.x - time.y) %>%
  ungroup()

This returns

# A tibble: 14 x 8
   initiate start time.x time.y   grp begin   end     diff
 1        0 no      2.82   2.82     1 -2.17  7.82 -0.00250
 2        0 no      2.82   2.82     1 -2.17  7.82 -0.00125
 3        1 yes     2.82   2.82     1 -2.17  7.82  0
 4        1 no      2.83   2.82     1 -2.17  7.82  0.00125
 5        1 no      2.83   2.82     1 -2.17  7.82  0.00200
 6        1 no      2.83   2.82     1 -2.17  7.82  0.00225
 7        0 no     16.8   16.8      2 11.8  21.8  -0.0137
 8        0 no     16.8   16.8      2 11.8  21.8  -0.0112
 9        0 no     16.8   16.8      2 11.8  21.8  -0.00120
10        1 yes    16.8   16.8      2 11.8  21.8   0
11        1 no     16.8   16.8      2 11.8  21.8   0.00380
12        0 no     16.8   16.8      2 11.8  21.8   0.00500
13        1 no     16.8   16.8      2 11.8  21.8   0.00630
14        1 no     16.8   16.8      2 11.8  21.8   0.00880

R calculating time differences in a (layered) long dataset

Using base R (no extra packages):

1. Sort the data, ordering by customer id, then by timestamp.
2. Calculate the time difference between consecutive rows (using the diff() function), grouping by customer id (tapply() does the grouping).
3. Find the average.
4. Squish that into a data.frame.

# 1 sort the data
df$Timestamp
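A minimal base R sketch of these four steps might look like the following. The data and the column name CustomerId are assumptions for illustration (only Timestamp appears in the original), and the timestamps are converted to seconds before diff() so that the result is plainly in minutes.

```r
# Hypothetical data; CustomerId is an assumed column name
df <- data.frame(
  CustomerId = c("b", "a", "a", "b", "a"),
  Timestamp = as.POSIXct(c("2017-01-31 12:00", "2017-01-31 10:00",
                           "2017-01-31 10:45", "2017-01-31 12:20",
                           "2017-01-31 10:02"), tz = "UTC")
)

# 1 sort the data by customer id, then by timestamp
df <- df[order(df$CustomerId, df$Timestamp), ]

# 2 gaps between consecutive rows per customer, in minutes
#   (as.numeric() on POSIXct gives seconds since the epoch)
gaps <- tapply(as.numeric(df$Timestamp), df$CustomerId,
               function(x) diff(x) / 60)

# 3 find the average gap for each customer
avg <- vapply(gaps, mean, numeric(1))

# 4 squish that into a data.frame
result <- data.frame(CustomerId = names(avg), mean_gap_mins = unname(avg))
```

A customer with a single row has no gaps, so mean() returns NaN for that customer; whether to drop or keep such rows depends on the analysis.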
